NSF PAR Search | NSF Public Access Repository

Health diagnosis associated with COVID-19 death in the United States: A retrospective cohort study using electronic health records

Joseph, Mariam; Li, Qiwei; Shin, Sunyoung (March 2025, PloS one)

Background The United States has experienced a high surge in COVID-19 cases since the beginning of 2020. Identifying the types of diagnoses that increase the risk of COVID-19 death will give our community a better perspective on the most vulnerable populations and enable these populations to implement better precautionary measures. Objective To identify demographic factors and health diagnosis codes that pose a high or a low risk of COVID-19 death, using individual health record data sourced from the United States. Methods We used logistic regression models to analyze the top 500 health diagnosis codes and demographics that have been identified as being associated with COVID-19 death. Results Among 223,286 patients who tested positive at least once, 218,831 (98%) were alive and 4,455 (2%) died during the study period. Through our logistic regression analysis, four demographic characteristics of patients (age, gender, race, and region) were found to be associated with COVID-19 mortality. Patients from the West region of the United States (Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, and Wyoming) had the highest odds ratio of COVID-19 mortality across the United States.
In terms of diagnoses, Complications mainly related to pregnancy (Adjusted Odds Ratio, OR: 2.95; 95% Confidence Interval, CI: 1.4 – 6.23) had the highest odds ratio for COVID-19 death, followed by Other diseases of the respiratory system (OR: 2.0; CI: 1.84 – 2.18), Renal failure (OR: 1.76; CI: 1.61 – 1.93), Influenza and pneumonia (OR: 1.53; CI: 1.41 – 1.67), Other bacterial diseases (OR: 1.45; CI: 1.31 – 1.61), Coagulation defects, purpura and other hemorrhagic conditions (OR: 1.37; CI: 1.22 – 1.54), Injuries to the head (OR: 1.27; CI: 1.1 – 1.46), Mood [affective] disorders (OR: 1.24; CI: 1.12 – 1.36), Aplastic and other anemias (OR: 1.22; CI: 1.12 – 1.34), Chronic obstructive pulmonary disease and allied conditions (OR: 1.18; CI: 1.06 – 1.32), Other forms of heart disease (OR: 1.18; CI: 1.09 – 1.28), Infections of the skin and subcutaneous tissue (OR: 1.15; CI: 1.04 – 1.27), Diabetes mellitus (OR: 1.14; CI: 1.03 – 1.26), and Other diseases of the urinary system (OR: 1.12; CI: 1.03 – 1.21). Conclusion We found demographic factors and medical conditions, including some novel ones, that are associated with COVID-19 death. These findings can be used for clinical and public awareness and for future research purposes.
Free, publicly-accessible full text available March 31, 2026
A regularized Bayesian Dirichlet-multinomial regression model for integrating single-cell-level omics and patient-level clinical study data

https://doi.org/10.1093/biomtc/ujaf005

Guo, Yanghong; Yu, Lei; Guo, Lei; Xu, Lin; Li, Qiwei (January 2025, Biometrics)

ABSTRACT The abundance of various cell types can vary significantly among patients with varying phenotypes and even those with the same phenotype. Recent scientific advancements provide mounting evidence that other clinical variables, such as age, gender, and lifestyle habits, can also influence the abundance of certain cell types. However, current methods for integrating single-cell-level omics data with clinical variables are inadequate. In this study, we propose a regularized Bayesian Dirichlet-multinomial regression framework to investigate the relationship between single-cell RNA sequencing data and patient-level clinical data. Additionally, the model employs a novel hierarchical tree structure to identify such relationships at different cell-type levels. Our model successfully uncovers significant associations between specific cell types and clinical variables across three distinct diseases: pulmonary fibrosis, COVID-19, and non-small cell lung cancer. This integrative analysis provides biological insights and could potentially inform clinical interventions for various diseases.
BayeSMART: Bayesian clustering of multi-sample spatially resolved transcriptomics data

https://doi.org/10.1093/bib/bbae524

Guo, Yanghong; Zhu, Bencong; Tang, Chen; Rong, Ruichen; Ma, Ying; Xiao, Guanghua; Xu, Lin; Li, Qiwei (October 2024, Briefings in Bioinformatics)

Abstract The field of spatially resolved transcriptomics (SRT) has greatly advanced our understanding of cellular microenvironments by integrating spatial information with molecular data collected from multiple tissue sections or individuals. However, methods for multi-sample spatial clustering are lacking, and existing methods primarily rely on molecular information alone. This paper introduces BayeSMART, a Bayesian statistical method designed to identify spatial domains across multiple samples. BayeSMART leverages artificial intelligence (AI)-reconstructed single-cell level information from the paired histology images of multi-sample SRT datasets while simultaneously considering the spatial context of gene expression. The AI integration enables BayeSMART to effectively interpret the spatial domains. We conducted case studies using four datasets from various tissue types and SRT platforms, and compared BayeSMART with alternative multi-sample spatial clustering approaches and a number of state-of-the-art methods for single-sample SRT analysis, demonstrating that it surpasses existing methods in terms of clustering accuracy, interpretability, and computational efficiency. BayeSMART offers new insights into the spatial organization of cells in multi-sample SRT data.
Bayesian nonparametric clustering with feature selection for spatially resolved transcriptomics data

https://doi.org/10.1214/25-AOAS2014

Zhu, Bencong; Hu, Guanyu; Xu, Lin; Fan, Xiaodan; Li, Qiwei (June 2025, The Annals of Applied Statistics)

Free, publicly-accessible full text available June 1, 2026
An interpretable Bayesian clustering approach with feature selection for analyzing spatially resolved transcriptomics data

https://doi.org/10.1093/biomtc/ujae066

Li, Huimin; Zhu, Bencong; Jiang, Xi; Guo, Lei; Xie, Yang; Xu, Lin; Li, Qiwei (July 2024, Biometrics)

ABSTRACT Recent breakthroughs in spatially resolved transcriptomics (SRT) technologies have enabled comprehensive molecular characterization at the spot or cellular level while preserving spatial information. Cells are the fundamental building blocks of tissues, organized into distinct yet connected components. Although many non-spatial and spatial clustering approaches have been used to partition the entire region into mutually exclusive spatial domains based on the SRT high-dimensional molecular profile, most require an ad hoc selection of less interpretable dimensional-reduction techniques. To overcome this challenge, we propose a zero-inflated negative binomial mixture model to cluster spots or cells based on their molecular profiles. To increase interpretability, we employ a feature selection mechanism to provide a low-dimensional summary of the SRT molecular profile in terms of discriminating genes that shed light on the clustering result. We further incorporate the SRT geospatial profile via a Markov random field prior. We demonstrate how this joint modeling strategy improves clustering accuracy, compared with alternative state-of-the-art approaches, through simulation studies and 3 real data applications.
Bayesian hidden mark interaction model for detecting spatially variable genes in imaging-based spatially resolved transcriptomics data

https://doi.org/10.3389/fgene.2024.1356709

Yang, Jie; Jiang, Xi; Jin, Kevin Wang; Shin, Sunyoung; Li, Qiwei (April 2024, Frontiers in Genetics)

Recent technology breakthroughs in spatially resolved transcriptomics (SRT) have enabled the comprehensive molecular characterization of cells whilst preserving their spatial and gene expression contexts. One of the fundamental questions in analyzing SRT data is the identification of spatially variable genes whose expressions display spatially correlated patterns. Existing approaches are built upon either the Gaussian process-based model, which relies on ad hoc kernels, or the energy-based Ising model, which requires gene expression to be measured on a lattice grid. To overcome these potential limitations, we developed a generalized energy-based framework to model gene expression measured from imaging-based SRT platforms, accommodating the irregular spatial distribution of measured cells. Our Bayesian model applies a zero-inflated negative binomial mixture model to dichotomize the raw count data, reducing noise. Additionally, we incorporate a geostatistical mark interaction model with a generalized energy function, where the interaction parameter is used to identify the spatial pattern. Auxiliary variable MCMC algorithms were employed to sample from the posterior distribution with an intractable normalizing constant. We demonstrated the strength of our method on both simulated and real data. Our simulation study showed that our method captured various spatial patterns with high accuracy; moreover, analysis of a seqFISH dataset and a STARmap dataset established that our proposed method is able to identify genes with novel and strong spatial patterns.
Full Text Available
AI-Powered Bayesian Statistics in Biomedicine

https://doi.org/10.1007/s12561-023-09400-x

Li, Qiwei (December 2023, Statistics in Biosciences)

Full Text Available
iIMPACT: integrating image and molecular profiles for spatial transcriptomics analysis

https://doi.org/10.1186/s13059-024-03289-5

Jiang, Xi; Wang, Shidan; Guo, Lei; Zhu, Bencong; Wen, Zhuoyu; Jia, Liwei; Xu, Lin; Xiao, Guanghua; Li, Qiwei (June 2024, Genome Biology)

Abstract Current clustering analysis of spatial transcriptomics data primarily relies on molecular information and fails to fully exploit the morphological features present in histology images, leading to compromised accuracy and interpretability. To overcome these limitations, we have developed a multi-stage statistical method called iIMPACT. It identifies and defines histology-based spatial domains based on AI-reconstructed histology images and spatial context of gene expression measurements, and detects domain-specific differentially expressed genes. Through multiple case studies, we demonstrate iIMPACT outperforms existing methods in accuracy and interpretability and provides insights into the cellular spatial organization and landscape of functional genes within spatial transcriptomics data.
Bayesian Landmark-Based Shape Analysis of Tumor Pathology Images

https://doi.org/10.1080/01621459.2023.2298031

Zhang, Cong; Bedi, Tejasv; Moon, Chul; Xie, Yang; Chen, Min; Li, Qiwei (February 2024, Journal of the American Statistical Association)

Full Text Available
Using persistent homology topological features to characterize medical images: Case studies on lung and brain cancers

https://doi.org/10.1214/22-AOAS1714

Moon, Chul; Li, Qiwei; Xiao, Guanghua (September 2023, The Annals of Applied Statistics)

Full Text Available
